Abstract

A  novel  network  architecture  was  developed  to  classify 
multiple  successive  echoes  from  targets  ensonified  by  a 
dolphin  echolocating  in  a  naturalistic  environment.  The 
inputs  to  the  network  were  spectral  vectors  of  the  echo 
plus  one  unit  representing  the  start  of  each  scan.  This 
network  combined  information  from  successive  echoes 
from  the  same  target  and  reset  between  scans  of 
different  targets.  The  network  was  trained  on  a  small 
subset  (4%)  of  the  total  set  of  available  echoes  (1,335). 
Depending  on  the  measure  used  to  assess  it,  the  network 
correctly  classified  between  90%  and  93%  of  all  echo 
trains.  In  contrast,  a  standard  backpropagation  network 
with  the  same  number  of  units  and  variable  connections 
performed  with  only  about  63%  accuracy  in  classifying 
echo  trains.  The  integration  model  seems  to  provide  a 
better  account  of  the  dolphin's  performance  than  a 
decision  model  that  does  not  combine  information  from 
multiple  echoes. 

Introduction 

Bottlenose dolphins (Tursiops truncatus) possess
a unique biological sonar which is highly adapted to
their aquatic environment (Moore et al., 1990). Using
this sonar the dolphin can readily identify many
characteristics of submerged objects by sending out
broad-band high frequency clicks and processing the
returning echoes (see Nachtigall, 1980, for a review).

The specific processes by which the dolphin
extracts acoustic information about the targets are
unknown, and particularly interesting questions concern
how the animal performs feature extraction from a set of
returning echoes (Nachtigall & Moore, 1988).

Behavioral  methods 

Our subject is a highly experienced male
bottlenose dolphin, housed in a floating enclosure in
Kaneohe Bay at the Hawaii Laboratory of the Naval
Command, Control and Ocean Surveillance Center
(RDT&E Division). During the echolocation tests the
animal's eyes are covered with soft removable eyecups
that occlude its vision. Echolocation data were recorded
while the animal was performing a delayed
matching-to-sample (DMTS) object recognition task.

In  this  task,  the  dolphin  must  select  from  a  set 
of  three  alternatives  the  one  target  that  is  the  same  as 
(matches)  a  previously  presented  sample  target.  The 
identity  and  location  of  the  targets  vary  randomly  from 
trial  to  trial,  so  performance  on  this  task  requires  the 
animal  to  recognize  the  sample,  remember  its  identity, 
and to recognize the matching target. To perform this
task the dolphin stationed himself underwater in the center
of an observing aperture, located directly in front of the
sample target array. Three sets of comparison targets
were suspended in front of the animal from a bar located
4.3 m from the underwater aperture. Echolocation
clicks were detected by B&K 8103 hydrophones located
2 m from the observing aperture between the aperture
and the targets. Echoes from the targets were recorded
using a custom-built hydrophone with a flat response up
to 200 kHz. Recordings were made using a RACAL
store-4 tape recorder, with a 300 kHz dynamic range,
from which clicks and echoes were digitized at 1 MHz.
Figure 1 is a schematic of the testing configuration.

The present study used three targets: (a) a PVC
plastic tube open at both ends (15 cm long, 7.5 cm
diameter, 30 mm wall thickness), (b) a water-filled
stainless steel sphere (5 cm diameter), and (c) a solid
aluminum cone (10 cm diameter base, 10 cm height),
each presented approximately 100 cm below the water's
surface. Four examples of each target were used, one
as sample, and the other three as alternative comparison
targets. Each trial began with the dolphin stationed in
the observing aperture with the acoustic screen closed.
One of the sample targets was then lowered into the
water, the screen was lowered, and the dolphin was
allowed to echolocate ad lib. The acoustic screen was
then raised, the sample was removed from the water and
three alternative targets were then presented. The screen
was then again lowered and the dolphin was allowed to
echolocate on the comparison targets. The animal
indicated his choice by contacting a small ball on the
end of a response wand at the water surface, directly
in front of each comparison target array. The dolphin's
choice accuracy averaged nearly 95% correct.

Figure 1. A schematic of the test pen. The animal is shown stationed
facing the acoustic screen, the measuring hydrophones, and the
comparison and sample target arrays. The hydrophone used to collect
the echoes from the targets was placed on the right side of the animal.

Echo  analysis  using  a  counterpropagation  network 

A selected sample of echoes collected from this
experiment was submitted to a counterpropagation
network (Hecht-Nielsen, 1987, 1988; see also Grossberg,
1976) trained to classify a subset of these echoes into
categories corresponding to each of the stimuli (see
Roitblat, Moore, Nachtigall, Fenner, & Au, 1989, for a
description of this work). This network learned to
classify the spectral information from the echoes with
considerable accuracy, above 95% correct (including
novel exemplars). Although the network could identify
the target with only a single echo, the dolphin
concurrently performing the same task emitted many
more clicks in identifying the same targets. We also
noticed that the dolphin was more variable in terms of
the number of clicks and the number of scans used to
identify the correct match than was predicted by a
sequential sampling model of his performance (Roitblat
et al., 1990).

The sequential sampling model assumed that
the echoes were drawn from a stationary distribution,
which may have been an inappropriate assumption in
light of the variability in the dolphin's click production.
Because  of  the  sampling  procedure  (echoes  were  selected 
largely  on  the  basis  of  their  intensity),  the  echoes 
submitted  to  the  neural  network  may  not  have  been 
typical  of  the  population  of  echoes  the  dolphin  actually 
used.  This  possibility  could  have  led  to  an  overestimate 
of  the  ability  of  the  models  to  recognize  targets  on  the 
basis  of  dolphin  echolocation  returns. 

In  response  to  these  considerations  we  extended 
our  analysis  to  include  every  echo  available  to  the 
dolphin.  In  contrast  to  our  previous  studies  concerning 
the  classification  of  echoes,  in  which  echoes  were 
selected  for  inclusion  if  they  were  sufficiently  intense,  in 
the  present  study  we  captured  the  echo  resulting  from 
every click the animal emitted in the sampled trials.

The  integrator  gateway  network 

A  new  network  architecture  was  developed  in 
order  to  model  the  dolphin's  extraction  of  information 
from  trains  of  echoes.  The  model  incorporates  the 
assumption  that  the  dolphin  averages  or  sums  spectral 
information  from  successive  echoes  and  continues  to 
emit  clicks  and  collect  returning  echoes  until  it  can 
classify  the  target  producing  those  echoes  with  sufficient 
confidence.  The  inputs  to  this  network  were  patterns  of 
spectral  intensity  (i.e.,  amplitude  in  each  frequency 
band).  The  outputs  of  the  network  were  stimulus 
classes. One output corresponded to each stimulus class:
sphere, cone, and tube. The resulting activations of each
of these output classes were taken to be an estimate of
the likelihood that the echo resulted from the particular
stimulus type (Qian & Sejnowski, 1988). Figure 2
shows  the  overall  structure  of  the  Integrator  Gateway 
Network. 
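
The stopping assumption in this model (keep collecting echoes until the target can be classified with sufficient confidence) can be sketched as a simple threshold rule. The fragment below is only an illustration: the function name `echoes_needed` and the criterion passed to it are hypothetical, and the text does not state the criterion value the model used.

```python
def echoes_needed(confidences, criterion):
    """Return how many echoes were needed to reach the confidence criterion.

    `confidences` holds one confidence value per successive echo in a train
    (hypothetical input; the criterion value is an assumption, not from the text).
    Returns None if the criterion is never reached within the train.
    """
    for count, value in enumerate(confidences, start=1):
        if value >= criterion:
            return count
    return None
```

For example, a train whose confidence rises 0.4, 0.6, 0.9 reaches a criterion of 0.9 on its third echo.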

Inputs to the network consisted of 30 bins of
relative amplitude spectral information, 3.91 kHz per
bin, ranging from 31.25 kHz to 146.5 kHz. Each echo
was also marked as to whether the echo was (1.00) or
was not (0.00) at the start of an echo train. The first
input to the network contained the start-of-train marker;
the remaining elements contained the amplitude of a
specified frequency range. The frequency inputs were
then passed to a scaler unit and to the integrator layer.

The integrator layer (grey circles) also contained 30
units, connected to the frequency units in the input layer
in a corresponding one-to-one pattern. The connection
weights from the inputs to the integrator layer were fixed
at 1.00. The connections to the scaler unit were fixed at
1/n, where n is the number of frequency inputs (i.e., 30).

Figure 2. A schematic of the Integrator Gateway Network. The
bottom part of the figure shows one echo in the form of relative
amplitude and a start-of-scan marker. Ellipses indicate that the full
network contains additional units of the same type.


The  output  of  the  scaler  unit,  which  was  simply 
the  sum  of  all  of  its  inputs,  was  passed  to  each  of  the 
units in the integrator layer via a fixed weight of -1.00.
The  effect  of  this  scaler  unit  was  to  subtract  the  average 
activity  of  the  input  layer  (neglecting  the  start-of-train 
marker)  from  the  inputs  to  the  integrator  layer. 
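
The effect of the scaler unit can be sketched in a few lines of Python. This is a minimal illustration of the wiring described above, not the authors' implementation; the function name `center_spectrum` is hypothetical.

```python
def center_spectrum(spectrum):
    """Subtract the average amplitude from every frequency bin.

    Mirrors the scaler unit as described in the text: each frequency input
    feeds the scaler with weight 1/n, and the scaler output reaches every
    integrator unit through a fixed weight of -1.00.
    """
    n = len(spectrum)
    scaler_output = sum(a * (1.0 / n) for a in spectrum)  # average activity
    return [a + (-1.0) * scaler_output for a in spectrum]
```

The centered values always sum to zero (up to rounding), so only the shape of the spectrum, not its overall level, reaches the integrator layer.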

The  elements  in  the  integrator  layer  computed  a 
cumulative  (running)  sum  of  the  inputs  they  received. 
One  echo  was  presented  per  time  step.  The  activation  of 
each  unit  in  the  integrator  layer  was  the  sum  of  the 
activation  it  had  during  the  previous  time  step,  plus  the 
activation  it  received  from  the  scaler  unit,  plus  the 
activation  it  received  from  its  respective  input,  plus  the 
activation  of  its  corresponding  gateway  unit.  The  role  of 
the  integrator  layer  was  to  accumulate  and  integrate 
information from successive echoes. The outputs of the
integrator layer were passed back via fixed connections
with 1.00 weights to corresponding units in the gateway
layer (open triangles). Each unit in the gateway layer
acted as a reset for the corresponding unit in the
integrator layer.

The connection between each gateway unit and
its corresponding integrator unit was fixed at
-1.00. The output of the gateway unit was the product
of the output of its corresponding integrator unit and the
start-of-scan marker. The activation from the gateway
unit received by the integrator unit consisted of the
product of the connection weight (-1.00), the activation
of the start-of-scan marker, and the activation during the
previous time step of the corresponding unit in the
integrator layer. Because the marker had 1.00 activity at
the start of a click train and 0.00 activity otherwise, this
marker allowed the gateway unit to function as a reset
signal, causing the units in the integrator layer to be
reset to 0.0 at the start of every scan.
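
The integrator and gateway updates described above can be written as a short recurrence. The sketch below is an illustration under the stated wiring (input weight 1.00, scaler weight -1.00, gateway weight -1.00); the function name `integrate_echoes` and the (marker, spectrum) input format are assumptions made for the example.

```python
def integrate_echoes(echoes):
    """Yield integrator-layer activations, one list per time step.

    `echoes` is an iterable of (marker, spectrum) pairs, where marker is
    1.0 at the start of a scan and 0.0 otherwise (hypothetical format).
    """
    state = None
    for marker, spectrum in echoes:
        n = len(spectrum)
        if state is None:
            state = [0.0] * n
        mean = sum(spectrum) / n  # scaler unit output (average input activity)
        new_state = []
        for prev, amp in zip(state, spectrum):
            gateway = -1.0 * marker * prev  # cancels the old sum when marker is 1
            new_state.append(prev + amp - mean + gateway)
        state = new_state
        yield list(state)
```

With the marker at 1.0 the gateway term removes the previous sum, so the unit holds only the current mean-centered echo; with the marker at 0.0 the unit keeps accumulating.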

During each time step, the output of the integrator
layer also led via variable-weight connections to each of
the elements in the feature layer. The outputs of the
elements in the feature layer then led via variable-weight
connections to the output or classification layer.
The elements in these two layers contained sigmoid
transfer functions and were trained using a standard
cumulative backpropagation algorithm (McClelland &
Rumelhart, 1988; Rumelhart, Hinton, & Williams, 1986)
with the epoch size set to the number of training
samples (60).
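
The forward pass through the two trainable layers can be sketched as follows. This is only an illustration: the text does not give the number of feature units or say whether bias terms were used, so the layer sizes and the `biases` arguments here are assumptions, and the function names are hypothetical.

```python
import math

def sigmoid(x):
    """Logistic transfer function."""
    return 1.0 / (1.0 + math.exp(-x))

def layer(inputs, weights, biases):
    """One fully connected layer of sigmoid units (one weight row per unit)."""
    return [sigmoid(sum(w * a for w, a in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def classify(integrator_state, w_feature, b_feature, w_output, b_output):
    """Map integrator activations to three class activations."""
    feature = layer(integrator_state, w_feature, b_feature)
    return layer(feature, w_output, b_output)
```

Only the weights in these two layers would be adjusted by backpropagation; the integrator, scaler, and gateway connections stay fixed.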

The network was trained with six sets of ten
successive echoes selected from the ends of haphazardly
chosen echo trains. Two sets of echoes were chosen for
each stimulus in the set. The network was trained with
declining learning-rate parameters. The network
converged to a criterion RMS output error of 0.05 after
12,300 iterations.

Figure 3. Results of generalization testing of the network in the form
of the confidence of the network in assigning the echo train to the
proper category.


Integrator  gateway  results  and  discussion 

Figure 3 shows the results of generalization testing
of the network. The complete, original set of 1,335
sequential echoes was presented to the network and the
network was allowed to classify each echo train. Figure
3 shows the confidence of the network in assigning the
echo train to the proper category as a function of the
number of echoes received. "Confidence" was defined
as the ratio of the activation level of the correct
classification to the total output of the three
classification units. These confidence ratios correspond
to intermediate likelihood ratios (Qian & Sejnowski,
1988). Overall, the animal's performance is better than
that of our network. Roitblat et al. (1990b) reported
that the dolphin was 94.5% correct at selecting the
correct match. This level of performance required the
animal to identify the sample correctly and to identify
comparison stimuli correctly. The probability of both
occurring was observed to be 0.945. On the
assumption that the two identifications were independent
of one another, the probability of identifying both is
simply the product of the probabilities of identifying
each target individually. Therefore, the probability of
each identification can be estimated at
$p = \sqrt{0.945} \approx 0.972$ (assuming that each
occurred with equal probability). By no measure was our
network 97.2% accurate at identifying the stimuli, but
when it did identify the stimuli it tended to do so with
fewer echoes than were used by the dolphin.
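
The two quantities used in this comparison, the confidence ratio and the per-identification accuracy implied by the independence assumption, are easy to compute directly. The sketch below uses hypothetical function names and an invented output vector purely for illustration; only the 0.945 figure comes from the text.

```python
def confidence(outputs, correct_index):
    """Activation of the correct class divided by total class activation."""
    return outputs[correct_index] / sum(outputs)

def per_identification_probability(joint, n_identifications=2):
    """Probability of each identification, assuming all are independent
    and equally likely to succeed (the assumption stated in the text)."""
    return joint ** (1.0 / n_identifications)

# Illustrative output activations for sphere, cone, and tube (made up).
example_confidence = confidence([0.8, 0.1, 0.1], 0)
dolphin_p = per_identification_probability(0.945)
```

Under these assumptions `dolphin_p` is the square root of 0.945, which rounds to 0.972.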

According to our model, on a substantial number
of trials the dolphin continued to emit echolocation
signals beyond the rational stopping criterion prescribed
by sequential sampling theory, and failed to emit
sufficient clicks on one scan (the first scan of tube
targets). There could be several reasons why the
dolphin continued to sample after the network had
reached its confidence criterion. Among these is the
possibility that the dolphin considers a broader range of
targets in making its classification. This dolphin was
highly experienced, having served in various forms of the
experiment with many different targets for more than 5
years. Although this experiment was designed to present
only the same three targets at all times, the dolphin may
have persisted in classifying the echoes relative to a
much larger set of targets. More echoes may be
necessary to discriminate among this broader range of
targets.

Another possibility is that the dolphin uses other
information besides that used by the network. For
example, although the network was trained to classify
targets on the basis of relative-amplitude echo spectra,
the dolphin may use absolute target "strength" or a
variety of time-domain features (Au, 1988) as
discriminative cues.

A third possibility is that the dolphin may not be
able to represent the echo spectra with the same fidelity
that was available to the network. The dolphin may
occasionally "forget" or fail to attend to some of the
echo information. We also time-windowed the echoes
and thereby focused the network's "attention"
specifically on these intervals. The dolphin may not be
capable of such rigid timing and may need to emit some
clicks simply to determine target distance in order to
extract other information.

The  final  possibility  that  has  occurred  to  us  is  that 
the  dolphin  may  not  have  been  as  task-focused  as  the 
neural  network.  The  echoes  were  collected  in  a  natural 
environment  containing,  for  example,  many  moving  fish, 
other  dolphins,  etc.  It  is  possible  that  at  least  some  of 
the  clicks  may  have  been  directed  at  targets  other  than 
those presented explicitly by the experimenters, or that
the  dolphin  continued  to  click  at  the  target  while  actually 
attending  elsewhere. 

A  simple  backpropagation  network 

The architecture of the integrator gateway network
is substantially more complicated than that of more
standard network architectures. By way of comparison,
therefore, we trained a backpropagation network on the
same data in order to determine whether this additional
structure contributed to the performance of the network.
The backpropagation network contained exactly the same
number of inputs, hidden units, outputs, and adjustable
connections as the integrator network. The only
difference between the networks was the presence of the
integration apparatus in the integrator network and its
absence in the backpropagation network. The
backpropagation network was trained to the same 0.05
RMS error criterion using the same training parameters
and then tested on the full set of echoes.

Backpropagation  results 

Figure  4  shows  the  confidence  of  the  network  in 
assigning  the  echo  train  to  the  proper  category  as  a 
function  of  the  number  of  echoes  received.  Compared  to 
the  categorization  performance  of  the  integrator  network, 
the backpropagation network was much more variable.
Whereas the integrator network was trained to recognize
integrated combinations of echoes, the backpropagation
network was trained to recognize individual, independent
examples of echoes.

As Figure 4 illustrates, the individual echoes were
highly variable, and frequently assigned to an erroneous
category.

These data suggest that the integrator network
added significantly to the ability to classify sequentially
produced echoes. In other words, by implementing a
signal "averaging" mechanism in the neural network we
allowed the system to take advantage of the redundancy
inherent in the use of multiple echoes from the same
source and in the stochastic properties of the noise in
which those echoes are embedded.


Figure 4. Confidence of the backpropagation network in assigning the
echo train to the proper category as a function of the number of
echoes received. The number of trains classified from each target is
also indicated. [Recovered panel title: Sphere; horizontal axis:
successive echoes.]

In contrast, the backpropagation network was
required to process not only the characteristics of the
echoes themselves, but also the characteristics of the
noise. This results in many spurious classifications.
Presumably,  if  a  larger  training  set  had  been  employed, 
the  backpropagation  network  would  have  learned  to 
"abstract"  the  salient  properties  of  the  echoes,  but  within 
the  constraints  of  a  relatively  small  training  set  (60  of 
1,335  or  just  4%  of  the  total  number  of  echoes),  the 
integrator  network  does  a  much  better  job  of  separating 
the  signal  from  the  noise. 
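
The benefit of combining echoes can be illustrated with a small simulation. The sketch below is not a reanalysis of the dolphin data: it assumes a made-up "clean" spectrum and independent zero-mean Gaussian noise on each bin, and the function names and parameter defaults are hypothetical.

```python
import math
import random

def rms_error(estimate, truth):
    """Root-mean-square difference between two spectra."""
    return math.sqrt(sum((e - t) ** 2 for e, t in zip(estimate, truth)) / len(truth))

def simulate(n_echoes=10, n_bins=30, noise_sd=1.0, seed=0):
    """Compare the mean single-echo error with the error of the averaged echo."""
    rng = random.Random(seed)
    truth = [math.sin(k / 3.0) for k in range(n_bins)]  # invented clean spectrum
    echoes = [[t + rng.gauss(0.0, noise_sd) for t in truth]
              for _ in range(n_echoes)]
    single = sum(rms_error(e, truth) for e in echoes) / n_echoes
    averaged_echo = [sum(col) / n_echoes for col in zip(*echoes)]
    return single, rms_error(averaged_echo, truth)

single_error, averaged_error = simulate()
```

The averaged error can never exceed the mean single-echo error, and with independent noise it is expected to shrink roughly in proportion to the square root of the number of echoes combined.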

The  gateway  integrator  network  adds  a  level  of 
complexity to the standard backpropagation network
architecture  that  contributes  substantially  to  its 
performance.  Its  design  is  inspired  by  properties  of  the 
dolphin's  performance  and  it  represents  one  step  along  a 
development  path  that  seeks  to  include  more  of  the 
mechanisms  that  we  can  identify  from  the  neurobiology 
of echolocation (e.g., Suga, 1990) and from the
performance  of  dolphins  in  their  aquatic  environment. 
Although  the  results  of  the  present  study  do  not  prove 
that  dolphins  perform  similar  integration,  this  integration 
model  seems  to  provide  a  better  account  than  a  decision 
model  that  does  not  integrate. 